The objective of this report is to analyze the incidence rates of Guillain-Barré Syndrome (GBS) and anaphylactic shock using the PHARMO database, and to present the results clearly and concisely.
The data for this analysis were obtained from the database of PHARMO, a research institute in the Netherlands that maintains a database of electronic medical records. The PHARMO Data Network is a population-based network of healthcare databases that combines data from different healthcare settings in the Netherlands. These data sources are linked at the patient level through validated algorithms.
For this research, individuals who were observed between 01/01/2017 and 31/12/2020 are included. In addition, the study's inclusion and exclusion criteria consist of three requirements. First, individuals must have at least one year of follow-up before the study's start date. Second, individuals must have at least one day of follow-up after the study's start date. Lastly, repeated events are not considered: individuals are censored at the first occurrence of an event. After applying these requirements, the study population consists of N = 675 individuals.
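The two follow-up requirements can be sketched as a simple eligibility check. This is a minimal illustration in Python, not the code used for this report: the function name `is_eligible`, the arguments `op_start` and `op_end` (start and end of a person's observation period), and the reading of "one year" as 365 days are all assumptions.

```python
from datetime import date, timedelta

STUDY_START = date(2017, 1, 1)
STUDY_END = date(2020, 12, 31)


def is_eligible(op_start, op_end):
    """Return True if an observation period meets both follow-up requirements.

    op_start / op_end are hypothetical names for the first and last day
    on which the person is under observation.
    """
    # Requirement 1: at least one year (assumed: 365 days) before study start.
    has_lookback = op_start <= STUDY_START - timedelta(days=365)
    # Requirement 2: at least one day of follow-up after study start.
    has_follow_up = op_end >= STUDY_START + timedelta(days=1)
    return has_lookback and has_follow_up
```

Under these assumptions, a person observed from 01/06/2015 to 01/03/2018 is eligible, whereas a person whose observation starts on 01/06/2016 is not, because the look-back period is shorter than one year.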
Prior to the analysis, several tables were prepared and then combined in order to obtain the incidence rates. First, the OBSERVATION_PERIODS table was loaded and individuals who are not in the above-mentioned study population were removed.
After this, the PERSONS table was loaded. In this table, an age variable was derived from the birth date variable, with the start of the study period as the reference date. With this age variable, the following age bands were created: 0-19, 20-39, 40-59, 60-79. The age band 80+ was not created because there are no individuals older than 67.
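The derivation of age and age band can be illustrated as follows. This is a sketch only: the helper names `age_at` and `age_band` are hypothetical, and age is assumed to be counted in completed years on the study start date.

```python
from datetime import date

STUDY_START = date(2017, 1, 1)
# Age bands used in this report; no 80+ band is needed for this population.
AGE_BANDS = [(0, 19), (20, 39), (40, 59), (60, 79)]


def age_at(birth_date, reference=STUDY_START):
    """Age in completed years on the reference date."""
    # Subtract one year if the birthday has not yet occurred in the reference year.
    not_yet = (reference.month, reference.day) < (birth_date.month, birth_date.day)
    return reference.year - birth_date.year - int(not_yet)


def age_band(age):
    """Return the label of the band containing age, or None if outside all bands."""
    for low, high in AGE_BANDS:
        if low <= age <= high:
            return f"{low}-{high}"
    return None
```

For example, a person born on 02/01/1950 has not yet had a birthday on 01/01/2017, so `age_at` gives 66 and `age_band` gives "60-79".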
The next step in the data preparation was merging the OBSERVATION_PERIODS and PERSONS tables. Only individuals who are in the OBSERVATION_PERIODS table were kept.
Once the aforementioned tables were merged, concept sets were created from the diagnostic codes in the codesheet. All diagnostic codes and vocabularies of the events for which incidence rates are calculated can be found in Table 1.1.
Figure 1.1: A table
Lastly, the EVENTS table was loaded and merged with the event codes. Persons who had an event before the start of the study were removed from the dataset. If a person had more than one event, only the first event was kept.
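The event-cleaning rule described above can be sketched in a few lines. This is an illustration under assumptions, not the code used for the report: the function name `first_incident_events` and the representation of events as `(person_id, event_date)` pairs are hypothetical, and the rule is assumed to be applied separately for each event type.

```python
from datetime import date

STUDY_START = date(2017, 1, 1)


def first_incident_events(events, study_start=STUDY_START):
    """Keep one event per person: the first one, if it is not before study start.

    events is an iterable of (person_id, event_date) pairs for a single
    event type (hypothetical layout).
    """
    first = {}
    for person_id, event_date in events:
        # Track the earliest event date seen for each person.
        if person_id not in first or event_date < first[person_id]:
            first[person_id] = event_date
    # Persons whose first event precedes the study start are dropped entirely.
    return {pid: d for pid, d in first.items() if d >= study_start}
```

A person with events in 2018 and 2019 contributes only the 2018 event, while a person with events in 2016 and 2018 is removed because the first event predates the study.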
For the analysis, incidence rates per 1000 person-years were calculated for the study population per event (i.e. anaphylactic shock narrow, anaphylactic shock broad, and GBS). The formula for the overall incidence rate is given below:
\[ \text{Incidence rate overall} = \frac{\text{Total number of events}}{\text{Total person-years}} \times 1000 \]
Afterwards, the incidence rates were stratified by year and by age band. The formulas for these are given below:
\[ \text{Incidence rate by year} = \frac{\text{Total number of events for each year}}{\text{Person-years for each year}} \times 1000 \]
\[ \text{Incidence rate by year by age band} = \frac{\text{Total number of events for each year per age band}}{\text{Person-years for each year per age band}} \times 1000 \]
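The formulas above can be illustrated with a short Python sketch. It is not the code used for this report: the function names, the convention of 365.25 days per person-year, and the counting of both the entry and exit day as time at risk are assumptions. The exit date is assumed to be the earliest of the end of observation, the end of the study, and the first event.

```python
from datetime import date


def person_years_by_year(entry, exit_, years):
    """Split one person's time at risk (entry..exit_, inclusive) by calendar year."""
    result = {}
    for year in years:
        start = max(entry, date(year, 1, 1))
        end = min(exit_, date(year, 12, 31))
        days = (end - start).days + 1  # both boundary days count (assumption)
        result[year] = max(days, 0) / 365.25
    return result


def incidence_rate(n_events, person_years, per=1000):
    """Incidence rate per `per` person-years."""
    return n_events / person_years * per
```

For instance, 2 events observed over 4000 person-years correspond to `incidence_rate(2, 4000)`, that is 0.5 events per 1000 person-years. Summing the per-year person-time over all persons in a stratum gives the denominator for the stratified rates.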
As part of the technical validation, confidence intervals are calculated for the incidence rates. Exact Poisson confidence limits for the estimated rate are found as the Poisson means, for distributions with the observed number of events and probabilities relevant to the chosen confidence level, divided by the time at risk. The relationship between the Poisson and chi-square distributions is employed here (Ulm, 1990):
\[ Y_l = \frac{\chi^2_{2Y,\, \alpha/2}}{2} \]
\[ Y_u = \frac{\chi^2_{2(Y+1),\, 1-\alpha/2}}{2} \]
where \(Y\) is the observed number of events, \(Y_l\) and \(Y_u\) are the lower and upper confidence limits for \(Y\) respectively, \(\chi^2_{\nu,\alpha}\) is the chi-square quantile at cumulative probability \(\alpha\) on \(\nu\) degrees of freedom, and \(1-\alpha\) is the confidence level.
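The exact limits can also be obtained directly from the Poisson-mean definition given above, without a chi-square quantile function. The sketch below is an illustration using only the Python standard library, not the code used for this report. The function names and the bisection approach are assumptions, and the search bracket is chosen for small event counts and conventional confidence levels such as 95%. The lower limit is the mean for which the probability of observing Y or more events equals half the significance level, and the upper limit is the mean for which the probability of observing Y or fewer events equals half the significance level.

```python
import math


def poisson_cdf(k, mu):
    """P(X <= k) for X ~ Poisson(mu), summed term by term."""
    if k < 0:
        return 0.0
    term = math.exp(-mu)
    total = term
    for i in range(1, k + 1):
        term *= mu / i
        total += term
    return total


def _solve_decreasing(f, low, high, iterations=200):
    """Bisection for f(mu) = 0 where f is decreasing and f(low) > 0 > f(high)."""
    for _ in range(iterations):
        mid = (low + high) / 2
        if f(mid) > 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2


def exact_poisson_limits(y, alpha=0.05):
    """Exact confidence limits for an observed Poisson count y."""
    bracket = 2 * y + 50  # generous upper search bound for small counts (assumption)
    if y == 0:
        lower = 0.0
    else:
        # P(X >= y | mu) = alpha/2, i.e. P(X <= y - 1 | mu) = 1 - alpha/2
        lower = _solve_decreasing(
            lambda mu: poisson_cdf(y - 1, mu) - (1 - alpha / 2), 0.0, bracket)
    # P(X <= y | mu) = alpha/2
    upper = _solve_decreasing(
        lambda mu: poisson_cdf(y, mu) - alpha / 2, 0.0, bracket)
    return lower, upper


def rate_interval(y, person_years, alpha=0.05, per=1000):
    """Confidence interval for the incidence rate per `per` person-years."""
    lower, upper = exact_poisson_limits(y, alpha)
    return lower / person_years * per, upper / person_years * per
```

With zero observed events the lower limit is 0 and the upper limit for the count equals ln(40), approximately 3.69, at the 95% level. This is why a stratum without events still has an interval with non-zero width.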
Figure 1.3 shows the forest plots of the incidence rates. A total of 86 events occurred: 28 were anaphylaxis narrow, 22 were anaphylaxis possible, and 36 were GBS. When stratified into age bands, the person-years per stratum become smaller and the estimates less precise, especially in the highest and lowest bands. Most events occurred in the first three years, with around 15 to 20 events in these years. The last year of the study has almost no events and an incidence rate whose confidence interval includes zero.
Figure 1.2: Forest plots
Figure 1.3: Forest plots 2017